bigdataspain.org

THANK YOU FOR AN AMAZING CONFERENCE!

The 3rd edition of Big Data Spain in Nov 2014 was a resounding success.
Watch the video below and find out why our attendees, speakers, partners and friends turned Big Data Spain into one of the largest events in Europe about Hadoop, Spark, NoSQL and cloud technologies.

  • 08:30 ~ 09:00
    • Registration & Coffee ~ Hall & Espacio Buñuel ~
  • 09:00 ~ 09:30
  • 09:30 ~ 10:15
    • Sala 5
      • Business & Technical

      Apache Spark and OSS technologies by Paco Nathan @ Databricks

      English

      Paco Nathan, Director of Community Evangelism, Databricks
      How does Apache Spark fit within the landscape of Big Data technologies? These technologies have been changing abruptly, and this talk explores where that appears to be headed.

      On the one hand, the economics of datacenter technologies has shifted toward warehouse scale with commodity hardware, which implies multicore and large memory spaces. Incumbent technologies do not embrace those changes, while Spark and related OSS projects work in concert to leverage them. On the other hand, the shape of the data requirements is changing abruptly, with sensor data, microsatellites, and other IoT use cases boosting data rates by orders of magnitude. We have decades-old advanced math techniques available to address these industrial needs, but how will our software frameworks keep pace?

      This talk addresses the effective integration of newer OSS technologies for the technical audience, while providing guidance to the business audience to understand fundamental drivers for these changes — how the use cases for Big Data are extending. We will consider the roles played by functional programming (making complex workflows tractable), by cloud-based notebooks (a new wave of flexibility and collaboration), as well as how some of the advanced math has enormous implications on real-time analytics at scale.
  • 10:15 ~ 11:00
    • Sala 5
      • Business & Technical

      Data warehouse modernization programme by IBM

      Analytics, English

      Toby Woolfe, Big Data Solutions Leader, IBM
      General Motors (GM) is in the process of constructing a single global information warehouse that will become the foundation for all business analytics and decision support across the enterprise.
  • 11:00 ~ 11:30
    • Coffee Break ~ Espacio Buñuel ~
  • 11:30 ~ 12:15
    • Sala 5
      • Business

      MongoDB for your Big Data strategy by Norberto Leite

      MongoDB, NoSQL, English

      Norberto Leite, Solutions Architect | Eng, MongoDB Inc
      When one starts analysing the Big Data technology spectrum, one finds several different solutions for several different purposes. This may cause confusion, uncertainty and doubt about what to choose and what for, among both technical and business decision makers. This talk sheds some light on where you should consider MongoDB for your Big Data strategy and how to make the most of the dominant technologies in the field.
    • Sala 4
      • Technical

      Hue for Hadoop integrates with Impala & Spark by Cloudera

      Hadoop, English

      Enrico Berti, UI Engineer, Cloudera's Hue
      Open up big data to your user base! Hadoop brings many data crunching possibilities but also comes with a lot of complexity.
  • 12:15 ~ 13:00
    • Sala 5
      • Business

      Trafodion SQL-on-HBase for transactional workloads

      Hadoop, English

      Rodrigo Merino, Senior Presales Solution Architect, Hewlett-Packard
      • Open-source project for transactional SQL database capabilities on Hadoop
      • Trafodion extends Hadoop to provide guaranteed transactional integrity, enabling new kinds of big data applications to run on Hadoop.
    • Sala 4
      • Technical

      Storing and processing data in Hadoop by Jacek Juraszek

      Hadoop, English

      Jacek Juraszek, Expert Java/Hadoop, Grupa Allegro
      Jarosław Grabowski, Senior Java/Hadoop Developer, Grupa Allegro
      It is a fact that Hadoop is not "the ultimate solution". Data processing is a hard task, but switching to big data, where there are many more scaling issues, is even harder. Most classic approaches do not take into account data passing around the network, and are therefore not enough in the new landscape.
  • 13:00 ~ 13:15
    • Short Break ~ Espacio Buñuel ~
  • 13:15 ~ 14:00
    • Sala 5
      • Business

      APIs, IoTs, big data, analytics and cognitive computing

      Analytics, IoT, English

      Andy Thurai, Technologist, IBM
      The birth of a sophisticated Internet of Things has catapulted hybrid data collection, which mixes structured and unstructured data, to new heights. The goal with any analytics software is to find and improve data sets rather than spend time identifying, cleaning, and preparing the data.
    • Sala 4
      • Workshop

      Introduction to Stratio Streaming

      Streaming, Spanish

      David Morales, Big Data Architect, Stratio
      Antonio Navarro, Big Data Developer, Stratio
      Nowadays data-intensive processes and organizations of all sorts require the use of real-time data with increasing flexibility and complexity. We created Stratio Streaming to meet this demand.
  • 14:00 ~ 14:45
    • Lunch Break ~ Espacio Buñuel ~
  • 14:45 ~ 15:15
  • 15:15 ~ 16:00
    • Sala 5
      • Technical

      Next-Generation NoSQL Data Stores – HyperDex

      English

      Emin Gün Sirer, Founder, HyperDex
      Distributed key-value stores are now a standard component of high-performance web services and cloud computing applications. While key-value stores offer significant performance and scalability advantages, the first wave of NoSQL stores typically compromise on consistency, fault-tolerance, performance, or functionality, and sometimes on all four.

      This talk will present HyperDex, a novel, open-source, distributed key-value store developed by my group that provides (1) strong consistency guarantees, (2) fault-tolerance for failures and partitions affecting up to f nodes, and (3) a rich API which includes ACID transactions and a unique search primitive that enables queries on secondary attributes. HyperDex achieves these properties through the combination of three recent technical advances called hyperspace hashing, value-dependent chaining and linear transactions. Despite offering stronger guarantees than first-gen NoSQL data stores, HyperDex is also a factor of 2-13 faster than Cassandra and MongoDB.

      This talk will outline these techniques, identify new research directions, and discuss how these breakthroughs relate to the oft-quoted, but mostly misunderstood, CAP credo.

      Bio:
      Emin Gün Sirer is an Associate Professor of Computer Science at Cornell University. His current research focuses on infrastructure services for large-scale distributed systems, such as key-value stores, graph databases, and consensus protocols. He is also interested in self-organizing systems and cryptocurrencies.
    • Sala 4
      • Workshop

      Google Cloud Platform to predict football matches

      ML, English

      Jordan Tigani, Google Software Developer - BigQuery, Google
      Predict the future with machine learning over sports data. Open source tools to DIY.
  • 16:00 ~ 16:45
  • 16:45 ~ 17:00
    • Drinks Break ~ Espacio Buñuel ~
  • 17:00 ~ 17:45
    • Sala 5
      • Technical

      Real time analytics with MapReduce and in-memory

      Analytics, English

      Dr. William L. Bain, CEO, ScaleOut Software, Inc.
      Operational intelligence represents an important new step in the evolution of data analytics by integrating analytics into live systems to provide immediate feedback and identify emerging patterns. Object-oriented, in-memory models of real-world systems enable their behavior to be tracked and analyzed in real time using data-parallel computing techniques. This technique builds on the technology of Hadoop MapReduce but has important differences which enable real-time analysis of live data.
    • Sala 4
      • Business

      Hands on Machine Learning for a Business audience

      ML, English

      David Gerster, Chief Data Scientist, BigML
  • 17:45 ~ 18:30
    • Sala 5
      • Technical

      Large-scale graphs with Google(TM) Pregel

      English

      Michael Hackstein, Front End and Graph Specialist, ArangoDB
      • Gives rules of thumb to decide whether Pregel is useful for the attendee's use case
      • Shows which criteria the attendee has to keep an eye on when making a technology decision
      • Shows some tips & tricks for implementing a Pregel algorithm
    • Sala 4
      • Workshop

      BigInsights and streams: IBM Hadoop solution

      Analytics, Spanish

      Luis Reina, Data Specialist, IBM
      In this workshop Luis Reina will show two tools that come with IBM BigInsights:

      1) BigSheets is a way to generate Hadoop applications (map/reduce) without programming, so the end user can analyze big data without needing to know Java or other Hadoop languages such as Pig. BigSheets is a browser-based tool, included in the InfoSphere BigInsights Console, for analyzing and visualizing big data. BigSheets uses a spreadsheet-like interface that can model, filter, combine, and chart data collected from multiple sources, such as an application that collects social media data by crawling the Internet.

      2) Big SQL provides broad SQL support that is typical of commercial databases. You can issue queries using JDBC or ODBC drivers to access data that is stored in Hadoop, in the same way that you access databases from your enterprise applications. You can use the Big SQL server to execute standard SQL queries. Multiple queries can be executed concurrently.

      Big SQL provides support for large ad hoc queries by using MapReduce parallelism and point queries, which are low-latency queries that return information quickly to reduce response time and provide improved access to data.
      The IBM Hadoop edition enriches the standard Hadoop platform with high-value features, such as Big SQL, BigSheets, Text Analytics and others.

      Keywords: Hadoop, Big SQL, security

      Two takeaways from the session:
      • 1. BigInsights is built on the standard Apache Hadoop platform
      • 2. BigInsights adds high-value features on top of the standard platform
  • 18:30 ~ 19:15
    • Sala 5
      • Technical

      Benchmarking Big Data systems by Cloudera

      Analytics, English

      Yanpei Chen, Performance Engineering, Cloudera
      Gwen Shapira, Software Engineer, Cloudera
      You should assess performance marketing claims for the technical rigor of their metrics and measurement methods. When running benchmarks, generating numbers is easy; understanding how to interpret the numbers you have is the real challenge. We will show you how to critically check your own benchmarks for common mistakes.
    • Sala 4
      • Technical

      Geoquery massive amounts of HDFS data from Spark processes

      ML, English

      Marc Planagumà, Lead data engineer and researcher, BDigital
      This session aims to address the specific challenges of exploiting large amounts of geospatially enabled data by reviewing how researchers at BDigital Technology Centre have designed and implemented a stack for advanced Machine Learning on Urban Data and providing a way to geoquery massive amounts of HDFS data from Spark processes without hindering the overall system performance.
  • 19:15 ~ 20:00
    • Drinks ~ Barra ~
  • 08:30 ~ 08:55
    • Coffee ~ Espacio Buñuel ~
  • 09:00 ~ 09:45
  • 09:45 ~ 10:30
    • Sala 5
      • Business & Technical

      BigQuery for Genomics by Felipe Hoffa at Google

      Analytics, English

      Felipe Hoffa, Developer Relations engineer, BigQuery team, Google
      How big is the human genome? What tools can be used to manage and understand it?

      It turns out that the same SQL powers that Google BigQuery makes available for general usage can be applied to genomics. In this session we'll introduce the basics of managing genomes with our favorite big data tools, drawing parallels with more traditional use cases like analyzing view logs.

      Takeaways:

      • The same SQL constructs that help us understand the world can help us understand the basic fabric of life.
      • Live demos will highlight how we can leverage the latest in Google tools and services to accelerate data insights, bringing them from batch to real interactive time.
  • 10:30 ~ 11:15
  • 11:15 ~ 11:45
    • Coffee Break ~ Espacio Buñuel ~
  • 11:45 ~ 12:30
    • Sala 5
      • Business & Technical

      Internet of Things & Large scale Data Analysis by Amazon

      IoT, English

      Andreas Chatzakis, AWS Solutions Architect, Amazon Web Services UK Ltd
      This session describes how to build large-scale data collection and processing architectures on AWS, and shows how to use an Intel Galileo board to collect sensor data that is sent to backend processing services such as Amazon Kinesis or Amazon Redshift.
  • 12:30 ~ 13:15
    • Sala 5
      • Technical

      Graph use-cases with RDBMS and NOSQL stores by Jim Webber

      NoSQL, English

      Jim Webber, Chief Scientist, Neo Technology
      This talk covers the evolution of graphs as a primary pillar of the data movement, and contrasts graph use-cases with RDBMS and contemporary NOSQL stores, with a slight detour through distributed systems.
    • Sala 4
      • Workshop

      Machine Learning to predict low risk loans by BigML

      Analytics, ML, English

      Poul Petersen, CIO, BigML
      Traditionally, analyzing big data with machine learning tools has been prohibitively complex and expensive. In this session you will see how BigML makes machine learning more accessible than ever thanks to its well-defined workflow, insightful visualizations, and fully featured REST API.

      Using only a browser, we will develop a system to predict low risk loans using the rich data available from Lending Club. Techniques applied will include dataset transformations, random decision forests, clustering, anomaly detection, batch predictions, evaluations and more.

      An IPython notebook will be provided that utilizes BigML’s API to easily repeat every step taken during the session.

      Requirements for attendees to follow along:
      • BigML account
  • 13:15 ~ 13:30
    • Short Break ~ Espacio Buñuel ~
  • 13:30 ~ 14:15
    • Sala 5
      • Technical

      Sinfonier real-time analytics and cybersecurity by Telefonica

      Analytics, Streaming, English, Cybersecurity

      Fran Gómez, Security Area, Telefónica
      Never in the history of the world has so much information been available at our fingertips. Yet taking advantage of this ever-increasing amount of information is hampered by the lack of dynamic and user-friendly technologies that provide real-time processing capabilities, a serious handicap in a business where time is always critical.
    • Sala 4
      • Workshop

      Introduction to Neo4j Workshop by Jim Webber

      GraphDB, English

      Jim Webber, Chief Scientist, Neo Technology
      In 45 short minutes, Neo4j's Chief Scientist Jim Webber will take you through the fundamentals of Graphs, Graph Modelling and Neo4j's Cypher query language, culminating in a live delivery of a realistic retail recommendations system.
      Come along to see how things that would take months with legacy data technology take minutes with graphs, and leave with a good basic grounding in how to repeat the approach in your data.
  • 14:15 ~ 15:00
    • Lunch Break ~ Espacio Buñuel ~
  • 15:00 ~ 15:30
    • Sala 5
      • Business

      EMC Pivotal, Stratio and GFT Group round table BigDataSpain

      English

      In 2013, the conference Big Data Spain managed to convey the message that Big Data does not have to be the expensive, complicated and cumbersome dragon that corporations did not dare to deal with.
      As in all hype cycles, the meaning of buzzwords like Big Data often gets lost in translation. Are case studies the best way to explain Big Data to the top management and decision makers?
      How is the enterprise world adopting Big Data in 2015? What is the good news, and what are the challenges ahead?
  • 15:30 ~ 16:15
    • Sala 5
      • Business & Technical

      Analysis of tourists in Madrid & Barcelona | Big Data Spain

      Analytics, English

      Albert Solana, Consultant, RocaSalvatella
      Presents a new methodology for improved analysis and knowledge of the Spanish tourism industry, using real data gathered from cellphones and credit card transactions. It differentiates what it means “to have the data”, as Telefónica Móviles España or BBVA did, from “to analyze the data”, as Telefónica I+D did, or “to know what specific questions to ask of the data”, as RocaSalvatella did.
    • Sala 4
      • Workshop

      Stratio Crossdata: a SQL-like language for streaming queries

      NoSQL, Spark, Spanish

      Álvaro Agea, Big Data Architect, Stratio
      Daniel Higuero, Big Data Architect, Stratio
      Crossdata is a distributed peer-to-peer fault-tolerant framework that unifies the interaction with batch and streaming sources supporting multiple datastore technologies.
  • 16:15 ~ 17:00
  • 17:00 ~ 17:15
    • Drinks Break ~ Espacio Buñuel ~
  • 17:15 ~ 18:00
  • 18:00 ~ 18:45
    • Sala 5
      • Technical

      Stateless dataflows MapReduce Spark

      ML, English

      Raúl Castro Fernández, Computer Science PhD student, Imperial College
      Stateful Dataflow Graphs (SDG) are a new dataflow representation that introduces 'state' explicitly in the dataflow. This talk includes reflections on this new programming model, its virtues and limitations, and a short discussion of the now long pursuit of “the Big Data Language”.
    • Sala 4
      • Workshop

      Install HyperDex NoSQL clusters and API by Robert Escriva

      NoSQL, English

      Robert Escriva, Co-founder, HyperDex
      This workshop will provide a ground-up introduction to HyperDex. Topics demonstrated in this session include:

      • Installing HyperDex
      • Deploying a cluster
      • Exploring HyperDex's rich API
      • Scaling the cluster horizontally
      • Backing up and restoring a running cluster

      Keywords: NoSQL, Document Store, Key-Value Store

      Bio:
      Robert Escriva is the co-founder and chief architect at HyperDex, a next-generation data and document store that provides high performance, fault tolerance, and strong consistency guarantees. He is broadly interested in building infrastructure for large-scale distributed systems.
  • 18:45 ~ 19:15
    • Sala 5

      Closing Session

  • 19:15 ~ 20:00
    • Drinks ~ Barra ~

* Subject to changes and adjustments
